Overview
--------

The data contained herein were developed by K. Myers and L. Lanahan as a part of their service on a National Academies of Sciences, Engineering, and Medicine (NASEM) review panel (with very helpful assistance from colleagues on the panel, NASEM staff, and Department of Energy staff). The panel conducted the "Review of the Small Business Innovation Research and Small Business Technology Transfer Programs at the Department of Energy" (NASEM, 2020).

A full description of this NASEM review panel, as well as the resulting report, is available at https://www.nationalacademies.org/our-work/review-of-the-small-business-innovation-research-and-small-business-technology-transfer-programs-at-the-department-of-energy#sectionCommittee (link active as of Jan. 1, 2022).


File List
---------

## SBIR AWARD DATA ##
1. sbir_awards.dta: Contains partially cleaned and standardized data from the Small Business Administration (SBA) public-facing data on SBIR/STTR awards, collected from https://www.sbir.gov/sbirsearch/award/all in 2020.

## FIRM NAME DISAMBIGUATION AND CROSSWALK DATA ##
2. IDxwalknames_SBIR.dta: Contains disambiguated firm identifiers based on the names of firms in the SBIR data.
3. IDxwalknames_patent.dta: Contains disambiguated firm identifiers based on the names of assignee organizations in the USPTO patent data obtained from PatentsView.org (https://patentsview.org) in 2020.
4. IDxwalknames_master.dta: Contains the master crosswalk between disambiguated firm names in the SBIR data and in the patent data. 

## FOA-CPC SIMILARITY DATA ##
5. foacpcsim_1997to2004_3gram.tsv: Contains the cosine-similarity crosswalk between Cooperative Patent Classification (CPC) codes and the Department of Energy's Funding Opportunity Announcement (FOA) topics from 1997 to 2018. Similarity is based on the textual overlap between the FOA topic texts and the abstracts and titles of patents granted between 1997 and 2004. See NASEM (2020) for further detail.
6. foacpc3sim_1997to2004_3gram.dta: Contains a transformed version of the similarity data described in (5), restricted to CPC codes at the main group level.
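
For readers unfamiliar with text-based cosine similarity, the following is a minimal sketch of the general idea behind files (5) and (6). It is not the procedure used to build these files: it assumes word-level 3-grams and raw counts, and the example texts and function names are hypothetical. The tokenization, weighting, and aggregation actually used are described in NASEM (2020).

```python
import math
from collections import Counter


def ngrams(text, n=3):
    """Return a Counter of word n-grams in the lower-cased text.

    Assumption: word-level n-grams split on whitespace. The real data
    may have used a different tokenization.
    """
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no overlap with anything
    return dot / (norm_a * norm_b)


# Hypothetical texts, for illustration only.
foa_topic = "advanced solid state battery materials for grid storage"
patent_text = "solid state battery materials with improved cycle life"
unrelated = "method for brewing coffee at high altitude quickly"

sim_related = cosine_similarity(ngrams(foa_topic), ngrams(patent_text))
sim_unrelated = cosine_similarity(ngrams(foa_topic), ngrams(unrelated))
print(sim_related, sim_unrelated)
```

In this toy setting, texts that share 3-grams receive a positive score and texts with no 3-grams in common receive zero, which is the sense in which the crosswalk links FOA topics to CPC codes.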

## FOA TOPICS ##
7. doetopics/...: A directory containing the PDFs of the DOE's SBIR FOAs, along with manually converted text versions of those files. The text files were produced by workers hired on the Freelancer platform in Fall 2020.




